Catbird Stats, LLC (KC); Florida International University (GK)
2023-11-16
There are multiple purposes for keeping tabular data in spreadsheets:
Relational database!
“All relational databases organize data into sets of interlinked tables.” -Thomer and Wickett 2020 (open access)
“A database is, in some sense, just a collection of tables, where there’s some value in the tables that allows them to be connected to each other (the ‘related’ part of ‘relational database’).” -Data Carpentry ‘Data Management with SQL for Ecologists’ workshop
Access
Oracle
MySQL
(of a well-built relational database)
“Front end” / “Back end”
data entry can be human-friendly
data validation
data storage is computer-friendly
all the linkages happen without you having to think about them
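To make the "data validation" and "linkages" points concrete, here is a minimal sketch using Python's built-in sqlite3 module (SQLite is not one of the systems named above; it is used only because it needs no setup). The table names, column names, and values are hypothetical, invented for illustration. The database itself rejects an impossible latitude and a sampling event that points to a site that does not exist.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("""
    CREATE TABLE sites (
        site_id   TEXT PRIMARY KEY,
        latitude  REAL NOT NULL CHECK (latitude BETWEEN -90 AND 90),
        longitude REAL NOT NULL CHECK (longitude BETWEEN -180 AND 180)
    )
""")
conn.execute("""
    CREATE TABLE sampling_events (
        event_id    INTEGER PRIMARY KEY,
        site_id     TEXT NOT NULL REFERENCES sites(site_id),
        sample_date TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO sites VALUES ('A1', 25.76, -80.37)")


def try_insert(sql, params):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


event_sql = "INSERT INTO sampling_events (site_id, sample_date) VALUES (?, ?)"
results = [
    # Typo in latitude (257.6 instead of 25.76): fails the CHECK constraint.
    try_insert("INSERT INTO sites VALUES (?, ?, ?)", ("B2", 257.6, -80.37)),
    # Site 'Z9' was never defined: fails the foreign key constraint.
    try_insert(event_sql, ("Z9", "2023-06-01")),
    # Valid event at an existing site.
    try_insert(event_sql, ("A1", "2023-06-01")),
]
for outcome in results:
    print(outcome)
```

The rules live in the table definitions, so every data entry route (form, script, import) is held to the same checks.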
(of a well-built relational database)
Queries - you can pull data back out in different ways
e.g., if you wanted the lat/long and habitat information associated with each individual sampling event or even individual fish
WITHOUT altering the original data
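The query idea above can be sketched with Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the three tables (sites, sampling_events, fish), their columns, and the placeholder values. The query attaches lat/long and habitat to each individual fish by following the links between tables, and the stored tables are left unchanged.

```python
import sqlite3

# Hypothetical schema and placeholder data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (
        site_id TEXT PRIMARY KEY, latitude REAL, longitude REAL, habitat TEXT
    );
    CREATE TABLE sampling_events (
        event_id INTEGER PRIMARY KEY,
        site_id TEXT REFERENCES sites(site_id),
        sample_date TEXT
    );
    CREATE TABLE fish (
        fish_id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES sampling_events(event_id),
        species TEXT,
        length_mm REAL
    );

    INSERT INTO sites VALUES
        ('A1', 25.76, -80.37, 'seagrass'),
        ('B2', 25.90, -80.12, 'mangrove');
    INSERT INTO sampling_events VALUES
        (1, 'A1', '2023-06-01'),
        (2, 'B2', '2023-06-02');
    INSERT INTO fish VALUES
        (1, 1, 'species X', 52.0),
        (2, 1, 'species Y', 110.5),
        (3, 2, 'species X', 48.3);
""")

# One row per fish, with the site information pulled in through the links:
# fish -> sampling event -> site.
rows = conn.execute("""
    SELECT f.fish_id, f.species, e.sample_date,
           s.latitude, s.longitude, s.habitat
    FROM fish AS f
    JOIN sampling_events AS e ON e.event_id = f.event_id
    JOIN sites AS s ON s.site_id = e.site_id
    ORDER BY f.fish_id
""").fetchall()

for row in rows:
    print(row)

# The query only reads; the fish table still holds the same three rows.
fish_count_after = conn.execute("SELECT COUNT(*) FROM fish").fetchone()[0]
```

Each site's coordinates and habitat are stored once, in the sites table, and repeated only in the query result, so a correction to a site needs to be made in just one place.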
Not everybody has ready access to database expertise
Not every database is designed well
Good databases require thoughtful design, as well as ongoing maintenance
Long-term projects
Projects involving lots of complexity
Projects where consistency system-wide is important
not just for others, but for you, future you, and future colleagues down the road
At its most basic: who, what, why, where, how
add details
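Different ways to record it - e.g., SWMP (System-Wide Monitoring Program) metadata, EML (Ecological Metadata Language)
The important thing is to capture the information - reformatting it down the road is much easier than reconstructing details that were never written down in the first place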
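As one low-effort way to capture the basics, here is a minimal sketch of a "who, what, why, where, how" record written as plain JSON with Python's standard library. The field names follow the list above; the descriptions are placeholders to be replaced with real details. This is not the SWMP or EML format, only a hypothetical starting point that could be reformatted into either later.

```python
import json

# Placeholder text only; replace each value with the real details.
metadata = {
    "who": "name and contact information for the person responsible",
    "what": "what was measured, with units for every column",
    "why": "the purpose of the project or monitoring program",
    "where": "site names and coordinates, including the datum",
    "how": "field and lab methods, instruments, and processing steps",
}

# Flag any of the five basics that were left out or left blank.
required = ["who", "what", "why", "where", "how"]
missing = [key for key in required if not metadata.get(key, "").strip()]

# Plain text that can be saved next to the data file it describes.
metadata_text = json.dumps(metadata, indent=2)
print(metadata_text)
print("missing fields:", missing)
```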
2023 NERRS Meeting